43 research outputs found

    The analog data assimilation

    Get PDF
    In light of growing interest in data-driven methods for oceanic, atmospheric, and climate sciences, this work focuses on the field of data assimilation and presents the analog data assimilation (AnDA). The proposed framework produces a reconstruction of the system dynamics in a fully data-driven manner where no explicit knowledge of the dynamical model is required. Instead, a representative catalog of trajectories of the system is assumed to be available. Based on this catalog, the analog data assimilation combines the nonparametric sampling of the dynamics using analog forecasting methods with ensemble-based assimilation techniques. This study explores different analog forecasting strategies and derives both ensemble Kalman and particle filtering versions of the proposed analog data assimilation approach. Numerical experiments are examined for two chaotic dynamical systems: the Lorenz-63 and Lorenz-96 systems. The performance of the analog data assimilation is discussed with respect to classical model-driven assimilation. A Matlab toolbox and Python library of the AnDA are provided to help further research building upon the present findings.
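    The catalog-based forecasting step at the heart of AnDA can be sketched as follows. This is a minimal illustration assuming a catalog of (state, successor) pairs and a Gaussian kernel weighting of the nearest analogs; the function name `analog_forecast` and the toy catalog are hypothetical, not the toolbox's actual implementation.

```python
import numpy as np

def analog_forecast(x, catalog_states, catalog_successors, k=3):
    """Forecast the next state of x by a kernel-weighted average of the
    successors of its k nearest analogs in the catalog (a locally
    constant analog operator)."""
    dists = np.linalg.norm(catalog_states - x, axis=1)
    idx = np.argsort(dists)[:k]
    # Gaussian kernel weights, scaled by the median analog distance.
    w = np.exp(-dists[idx] ** 2 / (np.median(dists[idx]) ** 2 + 1e-12))
    w /= w.sum()
    return (w[:, None] * catalog_successors[idx]).sum(axis=0)

# Toy catalog sampled from the linear map x -> 0.9 * x.
states = np.linspace(-1.0, 1.0, 201).reshape(-1, 1)
successors = 0.9 * states
forecast = analog_forecast(np.array([0.5]), states, successors)
print(forecast)
```

    In AnDA, this forecast replaces the physical-model integration inside an ensemble Kalman or particle filter, so the assimilation never evaluates model equations online.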

    Non-parametric Ensemble Kalman methods for the inpainting of noisy dynamic textures

    No full text
    In this work, we propose a novel non-parametric method for the temporally consistent inpainting of dynamic texture sequences. The inpainting of texture image sequences is stated as a stochastic assimilation issue, for which a novel model-free and data-driven ensemble Kalman method is introduced. Our model is inspired by the Analog Ensemble Kalman Filter (AnEnKF) recently proposed for the assimilation of geophysical space-time dynamics, where the physical model is replaced by statistical analogs, or nearest neighbours. Such a non-parametric framework is of key interest for image processing applications, as prior models are seldom available in general. We present experimental evidence on real dynamic textures that, using only a catalog database of historical data and without any assumption on the model, the proposed method provides relevant, dynamically consistent interpolation and outperforms the classical parametric (autoregressive) dynamical prior.

    Spatio-temporal interpolation of Sea Surface Temperature using high resolution remote sensing data

    No full text
    In this work, we present a statistical model to generate relevant reanalyses of geophysical parameters. In particular, we use a stochastic equation to control the temporal and spatial variability of the signal, and we take into account the possible error of the observations. We solve the system iteratively using an ensemble Kalman filter and smoother. We apply the methodology to remote sensing data of Sea Surface Temperature (SST), using high-resolution SST maps provided by an infrared sensor that is sensitive to the presence of clouds. Comparing the results with the reference SST reanalysis, we demonstrate the capability of our approach to interpolate missing data while preserving the spatial and temporal consistency of the SST signal.
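    The analysis step of the ensemble Kalman filter used in such interpolation schemes can be sketched in a few lines. This is a generic stochastic-EnKF sketch assuming a linear observation operator H and Gaussian observation error covariance R; the function name and toy dimensions are illustrative, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(ensemble, y, H, R):
    """Stochastic EnKF analysis step: each member is nudged towards a
    perturbed copy of the observation y (linear operator H, error cov R)."""
    n = ensemble.shape[0]                            # ensemble size
    Pf = np.cov(ensemble, rowvar=False)              # forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=n)
    return ensemble + (y_pert - ensemble @ H.T) @ K.T

# Toy example: 2-d state, only the first component is observed.
ens = rng.normal(0.0, 1.0, size=(500, 2))
H = np.array([[1.0, 0.0]])
R = np.array([[1e-4]])
post = enkf_analysis(ens, np.array([1.0]), H, R)
print(post[:, 0].mean())  # pulled close to the observed value 1.0
```

    For gappy SST fields, H simply selects the cloud-free pixels, so the same update interpolates the missing ones through the ensemble covariances.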

    The analog data assimilation: application to 20 years of altimetric data

    No full text
    The reconstruction of geophysical dynamics remains a key challenge in ocean, atmosphere and climate sciences. Data assimilation methods are the state-of-the-art techniques to reconstruct space-time dynamics from noisy and partial observations. They typically involve multiple runs of an explicit dynamical model and may face severe operational limitations, including computational complexity, the lack of model consistency with respect to the observed data, and modeling uncertainties. Here, we demonstrate how large amounts of historical satellite data can open new avenues to address data assimilation issues, and to develop a fully data-driven assimilation. Assuming that a representative catalog of historical state trajectories is available, the key idea is to use the analog method to propose forecasts with no online evaluation of any physical model. The combination of these analog forecasts with observations relies on classical stochastic filtering methods. As an illustration of the proposed analog data assimilation, the brute-force use of 20 years of altimetric data is demonstrated to reconstruct mesoscale sea surface dynamics.

    A posteriori learning for quasi-geostrophic turbulence parametrization

    Full text link
    The use of machine learning to build subgrid parametrizations for climate models is receiving growing attention. State-of-the-art strategies address the problem as a supervised learning task and optimize algorithms that predict subgrid fluxes based on information from coarse resolution models. In practice, training data are generated from higher resolution numerical simulations transformed in order to mimic coarse resolution simulations. By essence, these strategies optimize subgrid parametrizations to meet so-called a priori criteria. But the actual purpose of a subgrid parametrization is to obtain good performance in terms of a posteriori metrics which imply computing entire model trajectories. In this paper, we focus on the representation of energy backscatter in two dimensional quasi-geostrophic turbulence and compare parametrizations obtained with different learning strategies at fixed computational complexity. We show that strategies based on a priori criteria yield parametrizations that tend to be unstable in direct simulations and describe how subgrid parametrizations can alternatively be trained end-to-end in order to meet a posteriori criteria. We illustrate that end-to-end learning strategies yield parametrizations that outperform known empirical and data-driven schemes in terms of performance, stability and ability to apply to different flow configurations. These results support the relevance of differentiable programming paradigms for climate models in the future.
    Comment: 36 pages, 14 figures, submitted to Journal of Advances in Modeling Earth Systems (JAMES)
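    The a priori versus a posteriori distinction can be made concrete on a scalar toy problem: instead of fitting single-step tendencies, a surrogate coefficient is optimized against a full model trajectory. Everything here (the linear toy dynamics, the finite-difference optimizer, all names) is an illustrative assumption, not the paper's setup.

```python
import numpy as np

# Toy reference dynamics x_{t+1} = 0.9 * x_t; we learn the coefficient b
# of a surrogate model by matching whole trajectories, i.e. an
# "a posteriori" criterion rather than single-step tendencies.

def rollout(b, x0, T):
    xs = [x0]
    for _ in range(T):
        xs.append(b * xs[-1])
    return np.array(xs)

x0, T = 1.0, 10
target = rollout(0.9, x0, T)

b, lr, eps = 0.5, 0.05, 1e-6
loss = lambda bb: np.mean((rollout(bb, x0, T) - target) ** 2)
for _ in range(500):
    # Finite-difference gradient of the trajectory (a posteriori) loss;
    # a differentiable-programming framework would compute this exactly.
    g = (loss(b + eps) - loss(b - eps)) / (2 * eps)
    b -= lr * g
print(b)  # approaches the reference coefficient 0.9
```

    The point of the sketch is only that the loss is evaluated on a rollout, so instability of the surrogate over many steps is penalized directly, which an a priori single-step fit cannot see.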

    Bridging observations, theory and numerical simulation of the ocean using machine learning

    Get PDF
    Progress within physical oceanography has been concurrent with the increasing sophistication of tools available for its study. The incorporation of machine learning (ML) techniques offers exciting possibilities for advancing the capacity and speed of established methods and for making substantial and serendipitous discoveries. Beyond vast amounts of complex data ubiquitous in many modern scientific fields, the study of the ocean poses a combination of unique challenges that ML can help address. The observational data available are largely spatially sparse, limited to the surface, and with few time series spanning more than a handful of decades. Important timescales span seconds to millennia, with strong scale interactions and numerical modelling efforts complicated by details such as coastlines. This review covers the current scientific insight offered by applying ML and points to where there is imminent potential. We cover the three main branches of the field: observations, theory, and numerical modelling. Highlighting both challenges and opportunities, we discuss both the historical context and salient ML tools. We focus on the use of ML in in situ sampling and satellite observations, and on the extent to which ML applications can advance theoretical oceanographic exploration, as well as aid numerical simulations. Applications covered also include model error and bias correction and current and potential use within data assimilation. While not without risk, there is great interest in the potential benefits of oceanographic ML applications; this review caters to this interest within the research community.

    Learning from ocean remote sensing data

    Get PDF
    Reconstructing geophysical fields from noisy and partial remote sensing observations is a classical problem well studied in the literature. Data assimilation is one class of popular methods to address this issue, through the use of classical stochastic filtering techniques such as ensemble Kalman or particle filters and smoothers. These proceed by an online evaluation of the physical model in order to provide a forecast for the state. The performance of data assimilation therefore relies heavily on the definition of the physical model. Meanwhile, the amount of observation and simulation data has grown very quickly in the last decades. This thesis focuses on performing data assimilation in a data-driven way, without access to explicit model equations. Its main contribution lies in developing and evaluating the analog data assimilation (AnDA), which combines analog methods (nearest-neighbor search) and stochastic filtering methods (Kalman filters, particle filters, hidden Markov models). Through applications to both simplified chaotic models and real ocean remote sensing case studies (sea surface temperature, along-track sea level anomalies), we demonstrate the relevance of AnDA for missing-data interpolation of nonlinear and high-dimensional dynamical systems from irregularly sampled and noisy observations. Driven by the rise of machine learning in recent years, the last part of this thesis is dedicated to the development of deep learning models for the detection and tracking of ocean eddies from multi-source and/or multi-temporal data (e.g., SST-SSH), the general objective being to outperform expert-based approaches.

    Revealing the Impact of Global Heating on North Atlantic Circulation Using Transparent Machine Learning

    No full text
    The North Atlantic ocean is key to climate through its role in heat transport and storage. Climate models suggest that the circulation is weakening, but the physical drivers of this change are poorly constrained. Here, the root mechanisms are revealed with the explicitly transparent machine learning (ML) method Tracking global Heating with Ocean Regimes (THOR). Addressing the fundamental question of the existence of dynamically coherent regions, THOR identifies these and their link to distinct currents and mechanisms, such as the formation regions of deep water masses and the locations of the Gulf Stream and North Atlantic Current. Beyond a black-box approach, THOR is engineered to elucidate its source of predictive skill, rooted in physical understanding. A labeled data set is engineered using an explicitly interpretable equation transform and a k-means application to model data, allowing theoretical inference. A multilayer perceptron is then trained, explaining its skill using a combination of layerwise relevance propagation and theory. With abrupt CO2 quadrupling, the circulation weakens due to a shift in deep water formation regions, a northward shift of the Gulf Stream, and an eastward shift in the North Atlantic Current. If CO2 is increased 1% yearly, similar but weaker patterns emerge, influenced by natural variability. THOR is scalable and applicable to a range of models using only the ocean depth, dynamic sea level and wind stress, and could accelerate the analysis and dissemination of climate model data. THOR constitutes a step toward the trustworthy ML called for within oceanography and beyond, as its predictions are physically tractable.
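    The regime-labeling stage described above rests on a standard k-means clustering of per-gridpoint dynamical features. A generic sketch might look like the following; this is not THOR's actual code, and the deterministic initialization and toy two-regime data are simplifying assumptions.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means: assign each feature vector (e.g. the balance of
    terms in a dynamical equation at a gridpoint) to one of k regimes."""
    # Deterministic init: spread initial centers across the dataset
    # (k-means++ would be more robust in practice).
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute centers.
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Toy data: two well-separated "regimes" in a 2-d feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
labels, centers = kmeans(X, 2)
```

    The resulting labels would then serve as targets for a classifier (an MLP in THOR), whose skill can be probed with attribution methods such as layerwise relevance propagation.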